Recognition using speech synthesis: a reactive dynamic for robust ASR

Author

  • Hervé Glotin
Abstract

Automatic Speech Recognition (ASR) systems perform poorly on noisy speech. In the Multi-Stream (MS) approach, commonly used to improve ASR robustness, each stream feeds one recognizer, and the resulting estimates are combined through a fusion process. Because some streams are optimal for the transmission of some phonemes [1,3], it is useful to give more weight to the best stream during feature extraction and/or the fusion process [1,2]. In contrast to this forward weighting strategy, we propose a new one based on a feedback loop from recognition to signal. The key idea is to use the current recognition to construct an Acoustic Image (α), which is compared to the input signal in order to calculate the Estimates Accuracy (ρ). For each frame t, ρ(t) is the correlation between the Power Spectral Density of the input signal, PSD(X(t)), and PSD(α(t)), which is the sum over phonemes k of E(PSD(k)), the average PSD of phoneme k (over the 300,000 labelled frames of the training set), weighted by the phoneme posteriors P(qk|X(t)). That is, PSD(α(t)) = Σk [ P(qk|X(t)) · E(PSD(k)) ] and ρ(t) = Corr[ PSD(X(t)), PSD(α(t)) ].
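The two formulas in the abstract can be illustrated with a minimal NumPy sketch. It is not the author's implementation. It assumes that Corr is the Pearson correlation taken across frequency bins within each frame, and that PSDs are stored as plain arrays of shape (frames, bins). The function names, array shapes, and toy values are all hypothetical.

```python
import numpy as np


def acoustic_image(posteriors, mean_psd):
    """PSD(alpha(t)) = sum_k P(q_k | X(t)) * E(PSD(k)).

    posteriors: (T, K) phoneme posteriors per frame (assumed layout).
    mean_psd:   (K, F) average PSD of each phoneme over the training set.
    Returns an array of shape (T, F).
    """
    return posteriors @ mean_psd


def estimates_accuracy(psd_x, psd_alpha):
    """rho(t) = Corr[PSD(X(t)), PSD(alpha(t))], one value per frame.

    Assumption: Corr is the Pearson correlation across frequency bins.
    Frames where either spectrum is flat (zero variance) get rho = 0.
    """
    x = psd_x - psd_x.mean(axis=1, keepdims=True)
    a = psd_alpha - psd_alpha.mean(axis=1, keepdims=True)
    num = (x * a).sum(axis=1)
    den = np.sqrt((x ** 2).sum(axis=1) * (a ** 2).sum(axis=1))
    return num / np.where(den > 0, den, 1.0)


# Toy data (hypothetical): 3 phonemes, 4 frequency bins, 2 frames.
mean_psd = np.array([[4.0, 3.0, 2.0, 1.0],
                     [1.0, 2.0, 3.0, 4.0],
                     [1.0, 3.0, 3.0, 1.0]])
posteriors = np.array([[1.0, 0.0, 0.0],   # frame 0: recognizer picks phoneme 0
                       [0.0, 1.0, 0.0]])  # frame 1: recognizer picks phoneme 1
psd_x = np.array([[8.0, 6.0, 4.0, 2.0],   # both input frames look like phoneme 0
                  [8.0, 6.0, 4.0, 2.0]])

alpha = acoustic_image(posteriors, mean_psd)
rho = estimates_accuracy(psd_x, alpha)
print(rho)
```

In this toy case the first frame's recognition agrees with the input spectrum, so its ρ is close to 1. The second frame's recognition contradicts the input, so its ρ is close to -1. In a multi-stream setting, a per-stream ρ(t) of this kind could serve as the feedback weight the abstract describes.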


Similar resources

Improving the performance of MFCC for Persian robust speech recognition

Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...


A novel framework for noise robust ASR using cochlear implant-like spectrally reduced speech

We propose a novel framework for noise robust automatic speech recognition (ASR) based on cochlear implant-like spectrally reduced speech (SRS). Two experimental protocols (EPs) are proposed in order to clarify the advantage of using SRS for noise robust ASR. These two EPs assess the SRS in both the training and testing environments. Speech enhancement was used in one of two EPs to improve the ...


Distributed Speech Recognition By

Speech is a natural mode of communication for human beings and speech recognition is an application that enables the interaction between human and machine via voice. As the cost of software and hardware needed to do recognition decreases, automatic speech recognition (ASR) has entered the consumer product mainstream. A particularly interesting application is wireless speech recognition, which i...


Improving the performance of a continuous speech recognition system using features extracted from speech manifolds in the reconstructed phase space

Designing new methods for extracting features from the speech signal and combining the information they provide is one of the most effective approaches to improving the performance of automatic speech recognition (ASR) systems. Recent research has shown that the speech signal contains nonlinear and chaotic properties, but the effects of these properties are not used in the continuous ...


Noise-robust ASR by Using Disti Approximated with Logarithmic No

Various approaches focused on noise-robustness have been investigated with the aim of using an automatic speech recognition (ASR) system in practical environments. We have previously proposed a distinctive phonetic feature (DPF) parameter set for a noise-robust ASR system, which reduced the effect of high-level additive noise[1]. This paper describes an attempt to replace normal distributions (...




Publication date: 2002